Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 10000 |
| Missing cells | 55130 |
| Missing cells (%) | 27.6% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 4.3 MiB |
| Average record size in memory | 453.1 B |
Variable types
| NUM | 9 |
|---|---|
| CAT | 6 |
| BOOL | 5 |
Reproduction
| Analysis started | 2020-07-18 07:41:38.125845 |
|---|---|
| Analysis finished | 2020-07-18 07:42:12.125508 |
| Duration | 34 seconds |
| Version | pandas-profiling v2.7.1 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| person has a high cardinality: 7144 distinct values | High cardinality |
| reward is highly correlated with reward_expected | High correlation |
| reward_expected is highly correlated with reward | High correlation |
| reward_expected is highly correlated with offer_id | High correlation |
| offer_id is highly correlated with reward_expected and 1 other field | High correlation |
| offer_type is highly correlated with offer_id | High correlation |
| offer_id has 4539 (45.4%) missing values | Missing |
| amount has 5461 (54.6%) missing values | Missing |
| reward_expected has 8818 (88.2%) missing values | Missing |
| reward has 4539 (45.4%) missing values | Missing |
| difficulty has 4539 (45.4%) missing values | Missing |
| duration has 4539 (45.4%) missing values | Missing |
| offer_type has 4539 (45.4%) missing values | Missing |
| web has 4539 (45.4%) missing values | Missing |
| email has 4539 (45.4%) missing values | Missing |
| mobile has 4539 (45.4%) missing values | Missing |
| social has 4539 (45.4%) missing values | Missing |
| person is uniformly distributed | Uniform |
| df_index has unique values | Unique |
| time has 459 (4.6%) zeros | Zeros |
| reward has 884 (8.8%) zeros | Zeros |
| difficulty has 884 (8.8%) zeros | Zeros |
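The per-column missing counts in the warnings can be cross-checked against the dataset-level totals (55130 missing cells, 27.6%). A minimal sketch in plain Python, with the counts copied from this report; the dictionary and variable names are illustrative only:

```python
# Missing-value counts per column, copied from the "Missing" warnings.
missing_per_column = {
    "offer_id": 4539,
    "amount": 5461,
    "reward_expected": 8818,
    "reward": 4539,
    "difficulty": 4539,
    "duration": 4539,
    "offer_type": 4539,
    "web": 4539,
    "email": 4539,
    "mobile": 4539,
    "social": 4539,
}

n_rows = 10000  # "Number of observations"
n_cols = 20     # "Number of variables"

# Total missing cells and their share of all cells in the table.
missing_cells = sum(missing_per_column.values())
missing_share = missing_cells / (n_rows * n_cols)

print(missing_cells)           # 55130
print(f"{missing_share:.1%}")  # 27.6%
```

Nine columns share the same count (4539), which equals the number of `transaction` events, so those fields appear to be populated only for offer-related rows.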
df_index
Real number (ℝ≥0)
| Distinct count | 10000 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 136634.7216 |
| Minimum | 13 |
| Maximum | 272744 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 13 |
|---|---|
| 5-th percentile | 13679.45 |
| Q1 | 68710 |
| median | 135995.5 |
| Q3 | 204147.75 |
| 95-th percentile | 258374.25 |
| Maximum | 272744 |
| Range | 272731 |
| Interquartile range (IQR) | 135437.75 |
Descriptive statistics
| Standard deviation | 78276.22095 |
|---|---|
| Coefficient of variation (CV) | 0.5728867453 |
| Kurtosis | -1.187308503 |
| Mean | 136634.7216 |
| Median Absolute Deviation (MAD) | 67725 |
| Skewness | -0.004384831313 |
| Sum | 1366347216 |
| Variance | 6127166767 |
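Several of the figures in the tables directly above are derived from others: the range from the extremes, the IQR from the quartiles, the coefficient of variation from the standard deviation and mean, and the variance from the standard deviation. A small sketch that recomputes them from the reported values (variable names are illustrative):

```python
import math

# Values copied from the quantile and descriptive statistics tables above.
minimum, maximum = 13, 272744
q1, q3 = 68710, 204147.75
mean = 136634.7216
std = 78276.22095

value_range = maximum - minimum  # "Range"
iqr = q3 - q1                    # "Interquartile range (IQR)"
cv = std / mean                  # "Coefficient of variation (CV)"
variance = std ** 2              # "Variance"

print(value_range)  # 272731
print(iqr)          # 135437.75
print(math.isclose(cv, 0.5728867453, rel_tol=1e-6))
print(math.isclose(variance, 6127166767, rel_tol=1e-6))
```

The same relationships hold for the other numeric variables in this report.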
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 179270 | 1 | < 0.1% | |
| 204177 | 1 | < 0.1% | |
| 101182 | 1 | < 0.1% | |
| 56729 | 1 | < 0.1% | |
| 152984 | 1 | < 0.1% | |
| 93591 | 1 | < 0.1% | |
| 54083 | 1 | < 0.1% | |
| 46484 | 1 | < 0.1% | |
| 34194 | 1 | < 0.1% | |
| 87440 | 1 | < 0.1% | |
| Other values (9990) | 9990 | 99.9% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 13 | 1 | < 0.1% | |
| 46 | 1 | < 0.1% | |
| 134 | 1 | < 0.1% | |
| 163 | 1 | < 0.1% | |
| 193 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 272744 | 1 | < 0.1% | |
| 272720 | 1 | < 0.1% | |
| 272718 | 1 | < 0.1% | |
| 272683 | 1 | < 0.1% | |
| 272641 | 1 | < 0.1% |
person
Categorical
| Distinct count | 7144 |
|---|---|
| Unique (%) | 71.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 78.2 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 99d0d659823e4fe5a2b9837140fbcc64 | 7 | 0.1% | |
| a3d2267a2d1f4eaabd6869c07b03cdf6 | 6 | 0.1% | |
| 83998670898c4604af38d23e74e52269 | 6 | 0.1% | |
| aba68a20aa8a42acbe25ebcf0800414b | 6 | 0.1% | |
| 60ce7058fd174a128b48371bd1dbd61b | 5 | 0.1% | |
| 5af71c1246834a02b6d671e3b93f8695 | 5 | 0.1% | |
| 8b6427bb8a2a423f92aeb27e22345449 | 5 | 0.1% | |
| 79d9d4f86aca4bed9290350fb43817c2 | 5 | 0.1% | |
| 3dfb16c7592048699d72aba5307fc67d | 5 | 0.1% | |
| 1a79f623402a4e53908e254531e04f26 | 5 | 0.1% | |
| Other values (7134) | 9945 | 99.5% |
Length
| Max length | 32 |
|---|---|
| Mean length | 32 |
| Min length | 32 |
Unicode categories
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Decimal_Number | 10 | 62.5% | |
| Lowercase_Letter | 6 | 37.5% |
Unicode scripts
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Common | 10 | 62.5% | |
| Latin | 6 | 37.5% |
Unicode blocks
| Value | Count | Frequency (%) | |
|---|---|---|---|
| ASCII | 16 | 100.0% |
event
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 78.2 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| transaction | 4539 | 45.4% | |
| offer received | 2393 | 23.9% | |
| offer viewed | 1886 | 18.9% | |
| offer completed | 1182 | 11.8% |
Length
| Max length | 15 |
|---|---|
| Mean length | 12.3793 |
| Min length | 11 |
Unicode categories
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Lowercase_Letter | 16 | 94.1% | |
| Space_Separator | 1 | 5.9% |
Unicode scripts
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Latin | 16 | 94.1% | |
| Common | 1 | 5.9% |
Unicode blocks
| Value | Count | Frequency (%) | |
|---|---|---|---|
| ASCII | 17 | 100.0% |
time
Real number (ℝ≥0)
| Distinct count | 120 |
|---|---|
| Unique (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 365.13 |
| Minimum | 0 |
| Maximum | 714 |
| Zeros | 459 |
| Zeros (%) | 4.6% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 6 |
| Q1 | 186 |
| median | 408 |
| Q3 | 528 |
| 95-th percentile | 648 |
| Maximum | 714 |
| Range | 714 |
| Interquartile range (IQR) | 342 |
Descriptive statistics
| Standard deviation | 200.1453815 |
|---|---|
| Coefficient of variation (CV) | 0.5481482799 |
| Kurtosis | -1.03883509 |
| Mean | 365.13 |
| Median Absolute Deviation (MAD) | 168 |
| Skewness | -0.30449619 |
| Sum | 3651300 |
| Variance | 40058.17372 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 408 | 589 | 5.9% | |
| 504 | 554 | 5.5% | |
| 576 | 546 | 5.5% | |
| 168 | 499 | 5.0% | |
| 336 | 482 | 4.8% | |
| 0 | 459 | 4.6% | |
| 516 | 116 | 1.2% | |
| 582 | 115 | 1.1% | |
| 510 | 110 | 1.1% | |
| 594 | 107 | 1.1% | |
| Other values (110) | 6423 | 64.2% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 459 | 4.6% | |
| 6 | 82 | 0.8% | |
| 12 | 69 | 0.7% | |
| 18 | 75 | 0.8% | |
| 24 | 67 | 0.7% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 714 | 32 | 0.3% | |
| 708 | 35 | 0.4% | |
| 702 | 37 | 0.4% | |
| 696 | 43 | 0.4% | |
| 690 | 28 | 0.3% |
offer_id
Categorical
| Distinct count | 10 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 4539 |
| Missing (%) | 45.4% |
| Memory size | 78.2 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2298d6c36e964ae4a3e7e9706d1fb8c2 | 645 | 6.5% | |
| f19421c1d4aa40978ebb69ca19b0e20d | 634 | 6.3% | |
| fafdcd668e3743c1bb461111dcafc2a4 | 615 | 6.2% | |
| ae264e3637204a6fb9bb56bc8210ddfd | 608 | 6.1% | |
| 4d5c57ea9a6940dd891ad53e9dbe8da0 | 555 | 5.5% | |
| 2906b810c7d4411798c6938adc9daaa5 | 529 | 5.3% | |
| 9b98b8c7a33c4b65b9aebfe6a799e6d9 | 524 | 5.2% | |
| 5a8bc65990b245e5a138643cd4eb9837 | 478 | 4.8% | |
| 0b1e1539f2cc45b7b9fa7c272da2e1d7 | 467 | 4.7% | |
| 3f207df678b143eea3cee63160fa8bed | 406 | 4.1% | |
| (Missing) | 4539 | 45.4% |
Length
| Max length | 32 |
|---|---|
| Mean length | 18.8369 |
| Min length | 3 |
Unicode categories
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Decimal_Number | 10 | 58.8% | |
| Lowercase_Letter | 7 | 41.2% |
Unicode scripts
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Common | 10 | 58.8% | |
| Latin | 7 | 41.2% |
Unicode blocks
| Value | Count | Frequency (%) | |
|---|---|---|---|
| ASCII | 17 | 100.0% |
amount
Real number (ℝ≥0)
| Distinct count | 2303 |
|---|---|
| Unique (%) | 50.7% |
| Missing | 5461 |
| Missing (%) | 54.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.682203128442389 |
| Minimum | 0.05 |
| Maximum | 943.33 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0.05 |
|---|---|
| 5-th percentile | 0.85 |
| Q1 | 3.47 |
| median | 10.64 |
| Q3 | 18.655 |
| 95-th percentile | 29.638 |
| Maximum | 943.33 |
| Range | 943.28 |
| Interquartile range (IQR) | 15.185 |
Descriptive statistics
| Standard deviation | 31.44431525 |
|---|---|
| Coefficient of variation (CV) | 2.298190939 |
| Kurtosis | 425.8318363 |
| Mean | 13.68220313 |
| Median Absolute Deviation (MAD) | 7.46 |
| Skewness | 18.79678142 |
| Sum | 62103.52 |
| Variance | 988.7449617 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2.18 | 10 | 0.1% | |
| 1.73 | 10 | 0.1% | |
| 1.83 | 9 | 0.1% | |
| 1.24 | 9 | 0.1% | |
| 1.35 | 8 | 0.1% | |
| 1.08 | 8 | 0.1% | |
| 1.76 | 7 | 0.1% | |
| 14.36 | 7 | 0.1% | |
| 8.98 | 7 | 0.1% | |
| 1.61 | 7 | 0.1% | |
| Other values (2293) | 4457 | 44.6% | |
| (Missing) | 5461 | 54.6% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0.05 | 3 | < 0.1% | |
| 0.07 | 3 | < 0.1% | |
| 0.08 | 3 | < 0.1% | |
| 0.09 | 1 | < 0.1% | |
| 0.1 | 2 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 943.33 | 1 | < 0.1% | |
| 779.29 | 1 | < 0.1% | |
| 747.74 | 1 | < 0.1% | |
| 644.85 | 1 | < 0.1% | |
| 632.31 | 1 | < 0.1% |
reward_expected
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | 0.3% |
| Missing | 8818 |
| Missing (%) | 88.2% |
| Memory size | 78.2 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 5 | 452 | 4.5% | |
| 2 | 321 | 3.2% | |
| 10 | 251 | 2.5% | |
| 3 | 158 | 1.6% | |
| (Missing) | 8818 | 88.2% |
Length
| Max length | 4 |
|---|---|
| Mean length | 3.0251 |
| Min length | 3 |
Unicode categories
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Decimal_Number | 5 | 62.5% | |
| Lowercase_Letter | 2 | 25.0% | |
| Other_Punctuation | 1 | 12.5% |
Unicode scripts
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Common | 6 | 75.0% | |
| Latin | 2 | 25.0% |
Unicode blocks
| Value | Count | Frequency (%) | |
|---|---|---|---|
| ASCII | 8 | 100.0% |
gender
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 78.2 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| M | 5668 | 56.7% | |
| F | 4176 | 41.8% | |
| O | 156 | 1.6% |
Length
| Max length | 1 |
|---|---|
| Mean length | 1 |
| Min length | 1 |
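Unicode categories
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Uppercase_Letter | 3 | 100.0% |
Unicode scripts
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Latin | 3 | 100.0% |
Unicode blocks
| Value | Count | Frequency (%) | |
|---|---|---|---|
| ASCII | 3 | 100.0% |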
age
Real number (ℝ≥0)
| Distinct count | 84 |
|---|---|
| Unique (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 53.6093 |
| Minimum | 18 |
| Maximum | 101 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 18 |
|---|---|
| 5-th percentile | 23 |
| Q1 | 41 |
| median | 54 |
| Q3 | 66 |
| 95-th percentile | 82 |
| Maximum | 101 |
| Range | 83 |
| Interquartile range (IQR) | 25 |
Descriptive statistics
| Standard deviation | 17.48448524 |
|---|---|
| Coefficient of variation (CV) | 0.3261464941 |
| Kurtosis | -0.564895089 |
| Mean | 53.6093 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | -0.04294424373 |
| Sum | 536093 |
| Variance | 305.7072242 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 58 | 274 | 2.7% | |
| 53 | 249 | 2.5% | |
| 59 | 248 | 2.5% | |
| 54 | 243 | 2.4% | |
| 51 | 241 | 2.4% | |
| 52 | 238 | 2.4% | |
| 56 | 237 | 2.4% | |
| 57 | 235 | 2.4% | |
| 49 | 229 | 2.3% | |
| 64 | 224 | 2.2% | |
| Other values (74) | 7582 | 75.8% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 18 | 49 | 0.5% | |
| 19 | 102 | 1.0% | |
| 20 | 85 | 0.9% | |
| 21 | 87 | 0.9% | |
| 22 | 119 | 1.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 101 | 4 | < 0.1% | |
| 100 | 4 | < 0.1% | |
| 99 | 2 | < 0.1% | |
| 98 | 2 | < 0.1% | |
| 97 | 6 | 0.1% |
income
Real number (ℝ≥0)
| Distinct count | 91 |
|---|---|
| Unique (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 64118.8 |
| Minimum | 30000.0 |
| Maximum | 120000.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 30000 |
|---|---|
| 5-th percentile | 33000 |
| Q1 | 48000 |
| median | 62000 |
| Q3 | 77000 |
| 95-th percentile | 103000 |
| Maximum | 120000 |
| Range | 90000 |
| Interquartile range (IQR) | 29000 |
Descriptive statistics
| Standard deviation | 21220.9076 |
|---|---|
| Coefficient of variation (CV) | 0.3309623324 |
| Kurtosis | -0.4470723483 |
| Mean | 64118.8 |
| Median Absolute Deviation (MAD) | 15000 |
| Skewness | 0.4697450794 |
| Sum | 641188000 |
| Variance | 450326919.3 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 73000 | 226 | 2.3% | |
| 70000 | 209 | 2.1% | |
| 56000 | 207 | 2.1% | |
| 52000 | 202 | 2.0% | |
| 53000 | 199 | 2.0% | |
| 60000 | 198 | 2.0% | |
| 55000 | 197 | 2.0% | |
| 57000 | 197 | 2.0% | |
| 61000 | 197 | 2.0% | |
| 54000 | 196 | 2.0% | |
| Other values (81) | 7972 | 79.7% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 30000 | 63 | 0.6% | |
| 31000 | 138 | 1.4% | |
| 32000 | 144 | 1.4% | |
| 33000 | 174 | 1.7% | |
| 34000 | 151 | 1.5% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 120000 | 8 | 0.1% | |
| 119000 | 30 | 0.3% | |
| 118000 | 32 | 0.3% | |
| 117000 | 29 | 0.3% | |
| 116000 | 32 | 0.3% |
days_since_became_member
Real number (ℝ≥0)
| Distinct count | 1559 |
|---|---|
| Unique (%) | 15.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 589.0626 |
| Minimum | 0 |
| Maximum | 1822 |
| Zeros | 7 |
| Zeros (%) | 0.1% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 57 |
| Q1 | 247 |
| median | 479.5 |
| Q3 | 871.25 |
| 95-th percentile | 1499 |
| Maximum | 1822 |
| Range | 1822 |
| Interquartile range (IQR) | 624.25 |
Descriptive statistics
| Standard deviation | 431.2562402 |
|---|---|
| Coefficient of variation (CV) | 0.7321059598 |
| Kurtosis | 0.007676090412 |
| Mean | 589.0626 |
| Median Absolute Deviation (MAD) | 282.5 |
| Skewness | 0.8414979558 |
| Sum | 5890626 |
| Variance | 185981.9447 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 330 | 28 | 0.3% | |
| 226 | 28 | 0.3% | |
| 341 | 27 | 0.3% | |
| 292 | 26 | 0.3% | |
| 303 | 26 | 0.3% | |
| 41 | 26 | 0.3% | |
| 349 | 26 | 0.3% | |
| 198 | 25 | 0.2% | |
| 343 | 24 | 0.2% | |
| 353 | 23 | 0.2% | |
| Other values (1549) | 9741 | 97.4% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 7 | 0.1% | |
| 1 | 1 | < 0.1% | |
| 2 | 7 | 0.1% | |
| 3 | 5 | 0.1% | |
| 4 | 12 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1822 | 2 | < 0.1% | |
| 1820 | 1 | < 0.1% | |
| 1819 | 1 | < 0.1% | |
| 1818 | 3 | < 0.1% | |
| 1817 | 3 | < 0.1% |
profile_group
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 39.2 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 5908 | 59.1% | |
| 0 | 4092 | 40.9% |
reward
Real number (ℝ≥0)
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 4539 |
| Missing (%) | 45.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.390770921076726 |
| Minimum | 0.0 |
| Maximum | 10.0 |
| Zeros | 884 |
| Zeros (%) | 8.8% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 5 |
| Q3 | 5 |
| 95-th percentile | 10 |
| Maximum | 10 |
| Range | 10 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 3.370419014 |
|---|---|
| Coefficient of variation (CV) | 0.7676144064 |
| Kurtosis | -0.8421199191 |
| Mean | 4.390770921 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.5427676456 |
| Sum | 23978 |
| Variance | 11.35972433 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 5 | 1625 | 16.2% | |
| 10 | 1163 | 11.6% | |
| 2 | 1144 | 11.4% | |
| 0 | 884 | 8.8% | |
| 3 | 645 | 6.5% | |
| (Missing) | 4539 | 45.4% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 884 | 8.8% | |
| 2 | 1144 | 11.4% | |
| 3 | 645 | 6.5% | |
| 5 | 1625 | 16.2% | |
| 10 | 1163 | 11.6% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 10 | 1163 | 11.6% | |
| 5 | 1625 | 16.2% | |
| 3 | 645 | 6.5% | |
| 2 | 1144 | 11.4% | |
| 0 | 884 | 8.8% |
difficulty
Real number (ℝ≥0)
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 4539 |
| Missing (%) | 45.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.821827504120124 |
| Minimum | 0.0 |
| Maximum | 20.0 |
| Zeros | 884 |
| Zeros (%) | 8.8% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 5 |
| median | 10 |
| Q3 | 10 |
| 95-th percentile | 20 |
| Maximum | 20 |
| Range | 20 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 5.134556455 |
|---|---|
| Coefficient of variation (CV) | 0.6564394897 |
| Kurtosis | 0.6140649377 |
| Mean | 7.821827504 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.5656070118 |
| Sum | 42715 |
| Variance | 26.36366999 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 10 | 2307 | 23.1% | |
| 5 | 1158 | 11.6% | |
| 0 | 884 | 8.8% | |
| 7 | 645 | 6.5% | |
| 20 | 467 | 4.7% | |
| (Missing) | 4539 | 45.4% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 884 | 8.8% | |
| 5 | 1158 | 11.6% | |
| 7 | 645 | 6.5% | |
| 10 | 2307 | 23.1% | |
| 20 | 467 | 4.7% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 20 | 467 | 4.7% | |
| 10 | 2307 | 23.1% | |
| 7 | 645 | 6.5% | |
| 5 | 1158 | 11.6% | |
| 0 | 884 | 8.8% |
duration
Real number (ℝ≥0)
| Distinct count | 5 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 4539 |
| Missing (%) | 45.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 158.05896355978757 |
| Minimum | 72.0 |
| Maximum | 240.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 78.2 KiB |
Quantile statistics
| Minimum | 72 |
|---|---|
| 5-th percentile | 72 |
| Q1 | 120 |
| median | 168 |
| Q3 | 168 |
| 95-th percentile | 240 |
| Maximum | 240 |
| Range | 168 |
| Interquartile range (IQR) | 48 |
Descriptive statistics
| Standard deviation | 51.21029153 |
|---|---|
| Coefficient of variation (CV) | 0.3239948585 |
| Kurtosis | -0.7745756285 |
| Mean | 158.0589636 |
| Median Absolute Deviation (MAD) | 48 |
| Skewness | 0.177781946 |
| Sum | 863160 |
| Variance | 2622.493959 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 168 | 2306 | 23.1% | |
| 120 | 1189 | 11.9% | |
| 240 | 1082 | 10.8% | |
| 72 | 478 | 4.8% | |
| 96 | 406 | 4.1% | |
| (Missing) | 4539 | 45.4% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 72 | 478 | 4.8% | |
| 96 | 406 | 4.1% | |
| 120 | 1189 | 11.9% | |
| 168 | 2306 | 23.1% | |
| 240 | 1082 | 10.8% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 240 | 1082 | 10.8% | |
| 168 | 2306 | 23.1% | |
| 120 | 1189 | 11.9% | |
| 96 | 406 | 4.1% | |
| 72 | 478 | 4.8% |
offer_type
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 4539 |
| Missing (%) | 45.4% |
| Memory size | 78.2 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| bogo | 2321 | 23.2% | |
| discount | 2256 | 22.6% | |
| informational | 884 | 8.8% | |
| (Missing) | 4539 | 45.4% |
Length
| Max length | 13 |
|---|---|
| Mean length | 5.2441 |
| Min length | 3 |
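Unicode categories
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Lowercase_Letter | 15 | 100.0% |
Unicode scripts
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Latin | 15 | 100.0% |
Unicode blocks
| Value | Count | Frequency (%) | |
|---|---|---|---|
| ASCII | 15 | 100.0% |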
web
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 4539 |
| Missing (%) | 45.4% |
| Memory size | 78.2 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 4375 | 43.8% | |
| 0 | 1086 | 10.9% | |
| (Missing) | 4539 | 45.4% |
email
Boolean
| Distinct count | 1 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 4539 |
| Missing (%) | 45.4% |
| Memory size | 78.2 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 5461 | 54.6% | |
| (Missing) | 4539 | 45.4% |
mobile
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 4539 |
| Missing (%) | 45.4% |
| Memory size | 78.2 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 4994 | 49.9% | |
| 0 | 467 | 4.7% | |
| (Missing) | 4539 | 45.4% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
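The calculation described above can be sketched in plain Python. This is an illustrative sketch rather than the code the profiling tool runs; `pearson_r` and the sample data are hypothetical names introduced here.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n normalisation cancels between numerator and denominator,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Note: a constant sample has zero variance and makes r undefined.
    return cov / math.sqrt(var_x * var_y)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
steep = pearson_r(xs, [2 * x + 1 for x in xs])      # steep increasing line
shallow = pearson_r(xs, [0.1 * x - 3 for x in xs])  # shallow increasing line
falling = pearson_r(xs, [-4 * x for x in xs])       # decreasing line
print(steep, shallow, falling)
```

All three samples lie exactly on a straight line, so r comes out as +1 or -1 (up to floating-point rounding) regardless of how steep the line is; only the direction changes the sign.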
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
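A minimal sketch of that recipe in plain Python: rank both samples, then apply the Pearson formula to the ranks. The function names are hypothetical, and giving tied values their average rank is an assumption made here, since the text does not say how ties are handled.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic but not linear
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For this cubic relationship ρ reaches 1 because the ordering is perfectly preserved, while Pearson's r on the raw values stays below 1.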
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations: τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
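The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch in plain Python follows; `kendall_tau_a` is a hypothetical name, and statistical libraries often apply a tie-corrected variant (tau-b) instead, so results can differ when the data contain ties.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    pairs = list(combinations(zip(xs, ys), 2))
    if not pairs:
        raise ValueError("need at least 2 observations")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in pairs:
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
        # direction == 0 is a tie: neither concordant nor discordant
    return (concordant - discordant) / len(pairs)

# One swapped pair out of six: 5 concordant, 1 discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)
```

This direct version compares every pair and therefore scales quadratically with the number of observations, which is acceptable for a sketch but slow on large columns.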
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available for the phik package.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | df_index | person | event | time | offer_id | amount | reward_expected | gender | age | income | days_since_became_member | profile_group | reward | difficulty | duration | offer_type | web | email | mobile | social |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 182824 | a4f3350b08934d41a80e526317842b40 | offer received | 504 | f19421c1d4aa40978ebb69ca19b0e20d | NaN | NaN | M | 40 | 31000.0 | 1333 | 1 | 5.0 | 5.0 | 120.0 | bogo | 1.0 | 1.0 | 1.0 | 1.0 |
| 1 | 179320 | fc05227412c64551b11d24e6a07332c1 | transaction | 516 | NaN | 5.83 | NaN | M | 48 | 53000.0 | 318 | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2 | 184377 | e744b0bba57b4eca9247d37aa2bc932b | offer received | 168 | 2298d6c36e964ae4a3e7e9706d1fb8c2 | NaN | NaN | M | 52 | 106000.0 | 232 | 0 | 3.0 | 7.0 | 168.0 | discount | 1.0 | 1.0 | 1.0 | 1.0 |
| 3 | 216930 | 9ed20a4ecdf94cc99d52f6fd6a07b363 | offer viewed | 438 | ae264e3637204a6fb9bb56bc8210ddfd | NaN | NaN | F | 22 | 55000.0 | 766 | 1 | 10.0 | 10.0 | 168.0 | bogo | 0.0 | 1.0 | 1.0 | 1.0 |
| 4 | 158787 | ef8f68ebd37f4560ae141cb13317ef06 | offer received | 576 | ae264e3637204a6fb9bb56bc8210ddfd | NaN | NaN | F | 33 | 40000.0 | 358 | 1 | 10.0 | 10.0 | 168.0 | bogo | 0.0 | 1.0 | 1.0 | 1.0 |
| 5 | 237278 | 8296312262cd4cc19872e53ad1e10345 | transaction | 684 | NaN | 3.14 | NaN | M | 44 | 47000.0 | 397 | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 6 | 223684 | 08eb126ad33f447ca3ad076482445c05 | transaction | 228 | NaN | 7.96 | NaN | M | 30 | 57000.0 | 1018 | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 7 | 79068 | 2d0884ac790148f3a57e97a84eb3d302 | transaction | 324 | NaN | 9.62 | NaN | M | 34 | 41000.0 | 66 | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 8 | 174117 | dcba56d7e54546f095f1e774662e60ba | transaction | 204 | NaN | 5.79 | NaN | M | 56 | 40000.0 | 1157 | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 9 | 187929 | 5a1fba0d6d0e4cb183abc82b4ed586df | offer received | 0 | 3f207df678b143eea3cee63160fa8bed | NaN | NaN | M | 57 | 33000.0 | 556 | 1 | 0.0 | 0.0 | 96.0 | informational | 1.0 | 1.0 | 1.0 | 0.0 |
Last rows
| | df_index | person | event | time | offer_id | amount | reward_expected | gender | age | income | days_since_became_member | profile_group | reward | difficulty | duration | offer_type | web | email | mobile | social |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9990 | 219640 | 04c66db4925042a5bb103fec87225370 | transaction | 564 | NaN | 33.44 | NaN | M | 67 | 99000.0 | 358 | 0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 9991 | 75408 | 2b538cb81add42819dd88b12c42b023f | offer viewed | 528 | f19421c1d4aa40978ebb69ca19b0e20d | NaN | NaN | F | 59 | 78000.0 | 722 | 0 | 5.0 | 5.0 | 120.0 | bogo | 1.0 | 1.0 | 1.0 | 1.0 |
| 9992 | 264011 | f5d31009d87f411091be6b7d008564cf | transaction | 390 | NaN | 3.10 | NaN | F | 44 | 60000.0 | 275 | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 9993 | 35441 | f2acd6b2a94a48efbcd6437406eb67b9 | offer received | 0 | fafdcd668e3743c1bb461111dcafc2a4 | NaN | NaN | F | 50 | 90000.0 | 322 | 0 | 2.0 | 10.0 | 240.0 | discount | 1.0 | 1.0 | 1.0 | 1.0 |
| 9994 | 240793 | e5f9566e11e740f5b61d9ca8802e13b8 | transaction | 534 | NaN | 5.67 | NaN | M | 48 | 59000.0 | 727 | 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 9995 | 144518 | 97681d53482d422f84214bf4e0a77b60 | offer viewed | 504 | 4d5c57ea9a6940dd891ad53e9dbe8da0 | NaN | NaN | M | 47 | 52000.0 | 679 | 1 | 10.0 | 10.0 | 120.0 | bogo | 1.0 | 1.0 | 1.0 | 1.0 |
| 9996 | 212037 | 6fb9bdf5ad7e43eea5133d3621cd1f4d | offer viewed | 516 | 2298d6c36e964ae4a3e7e9706d1fb8c2 | NaN | NaN | M | 20 | 33000.0 | 1281 | 1 | 3.0 | 7.0 | 168.0 | discount | 1.0 | 1.0 | 1.0 | 1.0 |
| 9997 | 154483 | 0750f076ecc945a99bcec8c00d1291e9 | offer viewed | 432 | 4d5c57ea9a6940dd891ad53e9dbe8da0 | NaN | NaN | F | 60 | 73000.0 | 666 | 0 | 10.0 | 10.0 | 120.0 | bogo | 1.0 | 1.0 | 1.0 | 1.0 |
| 9998 | 118836 | 6cbc092fb84a437f9a248420129ca1f2 | offer viewed | 342 | 9b98b8c7a33c4b65b9aebfe6a799e6d9 | NaN | NaN | F | 50 | 63000.0 | 40 | 1 | 5.0 | 5.0 | 168.0 | bogo | 1.0 | 1.0 | 1.0 | 0.0 |
| 9999 | 72709 | 18cc6dbe8e124c20b7bc93c39e854cdd | offer viewed | 414 | 0b1e1539f2cc45b7b9fa7c272da2e1d7 | NaN | NaN | M | 61 | 62000.0 | 348 | 1 | 5.0 | 20.0 | 240.0 | discount | 1.0 | 1.0 | 0.0 | 0.0 |